The language components in Verbmobil
Abstract
This paper gives an overview of the main problems and their solutions in the language components of the Verbmobil speech translation system. Interpretation of spontaneously spoken language has to take into account that syntax and semantics differ from written language, that punctuation is missing, that accent and intonation affect the meaning and the translation, that the output of the speech recognizer may be noisy, and that speakers produce errors due to distraction. The Verbmobil interpretation and translation components attack these problems by means of a grammar for spoken language, heavy use of prosodic information, a syntactic search on word hypothesis graphs, and a shallow, robust fallback translation device that is used in case the „deep“ translation fails.

1. PROBLEMS TO BE SOLVED

Syntactic and semantic processing of spontaneously spoken language faces problems that differ dramatically from those posed by the processing of written text. The special problems that arise from spoken input can be grouped into five distinct sets:

1. Spoken language differs from written language both in syntax and semantics [1]. Spoken German contains, for example, constructions like the so-called „ellipsis of the Vorfeld“ (Paßt mir nicht so gut [That doesn’t suit me]), extraposition of arguments and adjuncts (Wie sieht es aus am Dienstag? [How about Tuesday?]) and dislocation of semantic groups (Ich möchte um 2 Uhr einen Termin machen [I’d like to make an appointment at 2 o’clock.]).

2. There is no punctuation. An utterance like wie sieht es aus am Dienstag um 17 Uhr geht es nicht can therefore be translated by any of the following: [How is it?] [On Tuesday at five p.m. it is not possible.], [How about Tuesday?] [At five p.m. it is not possible.] or [How about Tuesday at five p.m.?] [Isn’t that possible?].

3. Different sentence stress or intonation may yield a different semantics and a different translation. Whereas the sentence wir brauchen noch einen TERMIN should be translated as we (still) need an appointment, the same sentence with stress on noch (wir brauchen NOCH einen Termin) should be translated as we need another appointment.

4. The output of the speech recognizer is noisy. Even with good recognizers it happens quite often that the most probable recognition result does not correspond to what the speaker has said, e.g. said dann bin ich nämlich in Münster, understood dann bin ich nehme ich in München.

5. The speaker’s utterances are sometimes erroneous. By „erroneous“ we do not mean cases where a speaker does not obey the rules of a normative grammar; these cases fall under problem 1. What we mean are errors that arise from distraction of the speaker, like false starts, repetitions, stuttering or sentence merging, as in heute geht es bei dir also heute also bei mir geht es heute nicht [today it is possible for you so today oh for you so for me it is impossible today].

Combining a speech recognizer with a commercial translation system makes these problems very apparent. Consider for example the spoken utterance da geht es bei mir wieder leider nicht dann bin ich nämlich in Münster ich könnte dann wieder ab 28. Mai, taken from the Verbmobil corpus. If we segment this by hand and give each segment to the translation system, the output is not very good English but somehow understandable (problem 1):

Again unfortunately, it doesn't go with me there. I then am namely in Münster. I then could as of 28 May again.

Without the segmentation the quality of the translation decreases drastically (problem 2):

//geh// there again unfortunately, I am not it with me then namely in Münster I could then again as of 28 May.

Things get even worse if we have the system translate the most probable string produced by the speech recognizer, da geht es bei mir weder leider nicht dann bin ich nehme ich in München ich könnte wenn wieder ab 28. Mai (problem 4):

//geh// with me there //weder// unfortunately, I am not then relieve I I could in Munich if again as of 28 May.

Similar observations can be made for problem 3 and especially problem 5.

2. THE VERBMOBIL SOLUTIONS

The solutions explored in the Verbmobil project mirror the problem groups to a certain extent. We have tried to solve problem 1 by developing a grammar for spoken German. Problems 2 and 3 are attacked by a substantial integration of prosody recognition and processing. We tried to handle problem 4 by using a word hypothesis graph (word lattice) together with a linguistic search routine, and problem 5 by a „shallow“ robust secondary analysis and translation component that combines techniques from speech act detection and information extraction.

Footnote 1: The work described in this paper was partially funded by the German Federal Ministry for Research and Technology (BMBF) in the framework of the Verbmobil Project under Grant 01IV102AO. The responsibility for the contents of this paper lies with the author.

Footnote 2: This paper reports on work done by many people at different sites, namely the DFKI, the IAI and the Universities of Saarbrücken, Stuttgart, Tübingen, and Erlangen-Nürnberg and at Siemens.
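The idea of searching a word hypothesis graph, rather than trusting the single most probable string, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the graph, the edge costs, the names `EDGES`, `paths_by_cost`, `toy_accepts` and `best_accepted`, and above all the acceptance test, which is only a toy stand-in for a real syntactic analysis and is not the Verbmobil search routine. The sketch enumerates complete paths from cheapest to most expensive and returns the first one the check accepts.

```python
import heapq
from collections import defaultdict

# Hypothetical word hypothesis graph for problem 4:
# (start_node, end_node, word, cost). Lower cost means "more probable".
# All costs are invented for this illustration.
EDGES = [
    (0, 1, "dann", 10),
    (1, 2, "bin", 10),
    (2, 3, "ich", 10),
    (3, 5, "nämlich", 25),   # what was actually said, but scored worse
    (3, 4, "nehme", 10),     # misrecognition, scored better
    (4, 5, "ich", 10),
    (5, 6, "in", 10),
    (6, 7, "Münster", 14),
    (6, 7, "München", 10),
]
START, FINAL = 0, 7


def paths_by_cost(edges, start, final):
    """Yield (cost, words) for every complete path, cheapest first."""
    outgoing = defaultdict(list)
    for src, dst, word, cost in edges:
        outgoing[src].append((dst, word, cost))
    heap = [(0, start, ())]
    while heap:
        cost, node, words = heapq.heappop(heap)
        if node == final:
            yield cost, list(words)
            continue
        for dst, word, edge_cost in outgoing[node]:
            heapq.heappush(heap, (cost + edge_cost, dst, words + (word,)))


# Toy stand-in for a syntactic check (an assumption, not a real grammar):
# accept a hypothesis only if it contains exactly one finite verb.
FINITE_VERBS = {"bin", "nehme"}


def toy_accepts(words):
    return sum(word in FINITE_VERBS for word in words) == 1


def best_accepted(edges, start, final, accepts):
    """Return the cheapest complete path that passes the check, or None."""
    for cost, words in paths_by_cost(edges, start, final):
        if accepts(words):
            return cost, words
    return None


ranked = list(paths_by_cost(EDGES, START, FINAL))
first_best = ranked[0]
chosen = best_accepted(EDGES, START, FINAL, toy_accepts)
print("first-best:", " ".join(first_best[1]))
print("accepted:  ", " ".join(chosen[1]))
```

With these invented costs the first-best path is the misrecognized dann bin ich nehme ich in München, while the cheapest path that passes the toy check contains nämlich again. The city is still München, because a purely structural check has no basis for preferring one place name over the other.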
Similar resources
The statistical approach to spoken language translation
This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule as in speech recognition, we show how the required probability distrib...
A Multi-Dimensional Representation of Context
The research project underlying this report was funded by the Federal Minister for Research and Technology under grant number 01 IV 101 K/1. The responsibility for the content of this work lies with the authors. Abstract In this paper we show how the notion of context has been refined in order to fulfill the requirements posed by a natural language process...
Mobile Speech-to-Speech Translation of Spontaneous Dialogs: An Overview of the Final Verbmobil System
Verbmobil is a speaker-independent and bidirectional speech-to-speech translation system for spontaneous dialogs in mobile situations. It recognizes spoken input, analyses and translates it, and finally utters the translation. The multilingual system handles dialogs in three business-oriented domains, with context-sensitive translation between three languages (German, English, and Japanese). Si...
The RWTH System for Statistical Translation of Spoken Dialogues
This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the Verbmobil project. The goal of the Verbmobil project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule as in speech recognition, we show how the required probability distrib...
The VERBMOBIL Treebanks
The Verbmobil treebanks of spoken German, English, and Japanese are part of the Verbmobil project, which has the overriding goal to develop a speaker-independent system for the translation of spontaneous speech. In the framework of this language technology project, the treebanks provide training data for a variety of language technology modules. The treebanks consist of annotated syntactic tree...
Multilingual Speech Recognition
The speech-to-speech translation system Verbmobil requires a multilingual setting. This consists of recognition engines in the three languages German, English and Japanese that run in one common framework together with a language identification component which is able to switch between these recognizers. This article describes the challenges of multilingual speech recognition and presents diffe...
Publication date: 1997